case report
MIMIC-IV-Ext-22MCTS: A 22 Millions-Event Temporal Clinical Time-Series Dataset with Relative Timestamp for Risk Prediction
Wang, Jing, Niu, Xing, Zhang, Tong, Shen, Jie, Kim, Juyong, Weiss, Jeremy C.
A crucial component of developing a reliable clinical risk prediction model is collecting high-quality time-series clinical events. In this work, we release such a dataset, consisting of 22,588,586 clinical time-series events, which we term MIMIC-IV-Ext-22MCTS. Our source data are discharge summaries selected from the well-known yet unstructured MIMIC-IV-Note \cite{Johnson2023-pg}. The general-purpose MIMIC-IV-Note poses specific challenges for our work: the discharge summaries are too lengthy for typical natural language models to process, and the clinical events of interest are often not accompanied by explicit timestamps. We therefore propose a new framework that works as follows: 1) we break each discharge summary into manageably small text chunks; 2) we apply contextual BM25 and contextual semantic search to retrieve chunks with a high potential of containing clinical events; and 3) we carefully design prompts to teach the recently released Llama-3.1-8B \cite{touvron2023llama} model to identify or infer the temporal information of each chunk. The resulting dataset is informative and transparent enough that standard models fine-tuned on it achieve significant improvements in healthcare applications. In particular, a BERT model fine-tuned on our dataset achieves a 10\% improvement in accuracy on a medical question-answering task and a 3\% improvement on a clinical trial matching task, compared with the classic BERT. The dataset is available at https://physionet.org/content/mimic-iv-ext-22mcts/1.0.0. The codebase is released at https://github.com/JingWang-RU/MIMIC-IV-Ext-22MCTS-Temporal-Clinical-Time-Series-Dataset.
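To make the three-step pipeline concrete, here is a minimal sketch in Python. It assumes the `rank_bm25` package for the lexical retrieval step; `llm_generate` is a hypothetical stand-in for a Llama-3.1-8B inference call, and the chunk size and prompt wording are illustrative assumptions, not the paper's exact settings.

```python
# Minimal sketch of the chunk -> retrieve -> prompt pipeline described above.
# Assumptions: rank_bm25 is installed (pip install rank-bm25); llm_generate is
# a hypothetical stand-in for a Llama-3.1-8B call.
from rank_bm25 import BM25Okapi

def chunk_summary(text: str, max_words: int = 200) -> list[str]:
    """Step 1: break a lengthy discharge summary into small text chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def retrieve_event_chunks(chunks: list[str], query: str, top_k: int = 5) -> list[str]:
    """Step 2: BM25 lexical retrieval of chunks likely to contain clinical events.
    (The paper also uses contextual semantic search; omitted here for brevity.)"""
    bm25 = BM25Okapi([c.lower().split() for c in chunks])
    scores = bm25.get_scores(query.lower().split())
    ranked = sorted(zip(scores, chunks), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]

def extract_temporal_events(chunk: str, llm_generate) -> str:
    """Step 3: prompt an LLM to identify or infer relative timestamps."""
    prompt = (
        "From the clinical text below, list each clinical event with its time "
        "relative to admission in hours (negative = before admission).\n\n"
        f"Text: {chunk}\n\nEvents (event | relative hours):"
    )
    return llm_generate(prompt)
```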
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
- North America > United States > Michigan (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > Arizona > Pima County > Tucson (0.05)
- Europe > United Kingdom (0.05)
Judging by Appearances? Auditing and Intervening Vision-Language Models for Bail Prediction
Basu, Sagnik, Prakash, Shubham, Barge, Ashish Maruti, Jaiswal, Siddharth D, Dash, Abhisek, Ghosh, Saptarshi, Mukherjee, Animesh
Large language models (LLMs) have been extensively used for legal judgment prediction tasks based on case reports and crime history. However, with a surge in the availability of large vision language models (VLMs), legal judgment prediction systems can now be made to leverage images of the criminals in addition to the textual case reports/crime history. Applications built in this way could lead to inadvertent consequences and be used with malicious intent. In this work, we run an audit to investigate the efficacy of standalone VLMs in the bail decision prediction task. We observe that performance is poor across multiple intersectional groups and that models \textit{wrongly deny bail to deserving individuals with very high confidence}. We design different intervention algorithms, first including legal precedents through a RAG pipeline and then fine-tuning the VLMs using innovative schemes. We demonstrate that these interventions substantially improve the performance of bail prediction. Our work paves the way for the design of smarter interventions on VLMs in the future, before they can be deployed for real-world legal judgment prediction.
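As an illustration of the precedent-based intervention, here is a minimal sketch of a RAG step ahead of the VLM call. `embed` and `vlm_predict` are hypothetical stand-ins, and the cosine-similarity rule and prompt wording are assumptions, not the authors' exact pipeline.

```python
# Minimal sketch of retrieval-augmented bail prediction: retrieve the most
# similar legal precedents, then prepend them to the VLM prompt.
import numpy as np

def retrieve_precedents(case_text, precedents, embed, top_k=3):
    """Return the top-k precedents most similar to the case report."""
    q = embed(case_text)                          # query embedding
    mat = np.stack([embed(p) for p in precedents])
    sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q) + 1e-9)
    return [precedents[i] for i in np.argsort(-sims)[:top_k]]

def bail_decision_with_rag(image, case_text, precedents, embed, vlm_predict):
    """Augment the VLM prompt with retrieved precedents before prediction."""
    context = "\n\n".join(retrieve_precedents(case_text, precedents, embed))
    prompt = (
        f"Relevant precedents:\n{context}\n\n"
        f"Case report:\n{case_text}\n\n"
        "Should bail be granted? Answer 'granted' or 'denied' with a brief rationale."
    )
    return vlm_predict(image, prompt)
```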
- Oceania > Australia (0.14)
- North America > United States > Illinois (0.05)
- Europe > Germany > Saarland > Saarbrücken (0.04)
- (3 more...)
- Law > Criminal Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Man develops psychosis following ChatGPT's salt-free diet
Reducing salt intake is often a solid way to improve your overall health. However, swapping out classic sodium chloride for sodium bromide is a reliable way to give yourself acne, involuntary muscle spasms, and paranoid psychosis. Knowing this, it's probably best to avoid that chemical compound entirely, even if ChatGPT tells you otherwise. In the recent case, a patient who was allegedly following the generative AI's nutritional suggestion was placed in a hospital's involuntary psychiatric hold for three weeks.
Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis
Noroozizadeh, Shahriar, Weiss, Jeremy C.
Clinical case reports and discharge summaries may be the most complete and accurate summarizations of patient encounters, yet they are finalized, i.e., timestamped, after the encounter. Complementary structured data streams become available sooner but suffer from incompleteness. To train models and algorithms on more complete and temporally fine-grained data, we construct a pipeline to phenotype, extract, and annotate time-localized findings within case reports using large language models. We apply our pipeline to generate an open-access textual time series corpus for Sepsis-3 comprising 2,139 case reports from the PubMed Open Access (PMOA) subset. To validate our system, we apply it to PMOA case reports and to timeline annotations from I2B2/MIMIC-IV, and compare the results to physician-expert annotations. We show high recovery rates of clinical findings (event match rates: O1-preview 0.755, Llama 3.3 70B Instruct 0.753) and strong temporal ordering (concordance: O1-preview 0.932, Llama 3.3 70B Instruct 0.932). Our work characterizes the ability of LLMs to time-localize clinical findings in text, illustrating the limitations of LLM use for temporal reconstruction and suggesting several potential avenues of improvement via multimodal integration.
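The temporal-ordering result suggests a pairwise concordance measure. Below is a minimal sketch of one common formulation, comparing the relative order of every event pair shared between model and expert timelines; this pairwise definition is an illustrative assumption, not necessarily the paper's exact statistic.

```python
# Minimal sketch: fraction of shared event pairs whose relative temporal
# order in the model timeline matches the reference timeline.
from itertools import combinations

def ordering_concordance(pred_times: dict, ref_times: dict) -> float:
    """Both dicts map a clinical finding to its (relative) timestamp."""
    shared = [e for e in pred_times if e in ref_times]
    agree = total = 0
    for a, b in combinations(shared, 2):
        ref_sign = (ref_times[a] > ref_times[b]) - (ref_times[a] < ref_times[b])
        pred_sign = (pred_times[a] > pred_times[b]) - (pred_times[a] < pred_times[b])
        total += 1
        agree += ref_sign == pred_sign
    return agree / total if total else 0.0

# Example: one of three pairs is out of order -> concordance 2/3.
print(ordering_concordance(
    {"fever": 0, "hypotension": 4, "antibiotics": 2},
    {"fever": 0, "hypotension": 2, "antibiotics": 4},
))
```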
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Maryland > Montgomery County > Bethesda (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (2 more...)
- Research Report > Experimental Study (0.67)
- Research Report > New Finding (0.46)
Psychological Counseling Cannot Be Achieved Overnight: Automated Psychological Counseling Through Multi-Session Conversations
Wang, Junzhe, Wang, Bichen, Fu, Xing, Sun, Yixin, Zhao, Yanyan, Qin, Bing
In recent years, Large Language Models (LLMs) have made significant progress in automated psychological counseling. However, current research focuses on single-session counseling, which does not reflect real-world scenarios. In practice, psychological counseling is a process, not a one-time event, requiring sustained, multi-session engagement to progressively address clients' issues. To overcome this limitation, we introduce the Multi-Session Psychological Counseling Conversation Dataset (MusPsy-Dataset). MusPsy-Dataset is constructed from real client profiles drawn from publicly available psychological case reports. It captures the dynamic arc of counseling, encompassing multiple progressive counseling conversations with the same client across different sessions. Leveraging this dataset, we also develop MusPsy-Model, which aims to track client progress and adapt its counseling direction over time. Experiments show that our model outperforms baseline models across multiple sessions.
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > Canada (0.04)
- Asia > China > Heilongjiang Province > Harbin (0.04)
CaseReportBench: An LLM Benchmark Dataset for Dense Information Extraction in Clinical Case Reports
Zhang, Xiao Yu Cindy, Ferreira, Carlos R., Rossignol, Francis, Ng, Raymond T., Wasserman, Wyeth, Zhu, Jian
Rare diseases, including Inborn Errors of Metabolism (IEM), pose significant diagnostic challenges. Case reports serve as key but computationally underutilized resources to inform diagnosis. Clinical dense information extraction refers to organizing medical information into structured predefined categories. Large Language Models (LLMs) may enable scalable information extraction from case reports but are rarely evaluated for this task. We introduce CaseReportBench, an expert-annotated dataset for dense information extraction of case reports, focusing on IEMs. Using this dataset, we assess various models and prompting strategies, introducing novel approaches such as category-specific prompting and subheading-filtered data integration. Zero-shot chain-of-thought prompting offers little advantage over standard zero-shot prompting. Category-specific prompting improves alignment with the benchmark. The open-source model Qwen2.5-7B outperforms GPT-4o for this task. Our clinician evaluations show that LLMs can extract clinically relevant details from case reports, supporting rare disease diagnosis and management. We also highlight areas for improvement, such as LLMs' limitations in recognizing negative findings important for differential diagnosis. This work advances LLM-driven clinical natural language processing and paves the way for scalable medical AI applications.
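To illustrate the category-specific prompting strategy, here is a minimal sketch: one focused prompt per predefined category instead of a single monolithic extraction prompt. The category names, prompt wording, and the `llm_generate` helper are illustrative assumptions, not the benchmark's actual schema.

```python
# Minimal sketch of category-specific prompting for dense information
# extraction: query the model once per structured category.
CATEGORIES = ["history", "physical exam", "laboratory results",
              "imaging", "diagnosis", "treatment"]

def dense_extract(case_report: str, llm_generate) -> dict:
    """Extract one structured field per category from a case report."""
    extracted = {}
    for category in CATEGORIES:
        prompt = (
            f"From the case report below, extract only the {category} "
            "findings as a concise list. Write 'none reported' if absent.\n\n"
            f"{case_report}"
        )
        extracted[category] = llm_generate(prompt)
    return extracted
```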
- North America > Canada > British Columbia (0.05)
- Asia > China > Hong Kong (0.04)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports
Wu, Kevin, Wu, Eric, Thapa, Rahul, Wei, Kevin, Zhang, Angela, Suresh, Arvind, Tao, Jacqueline J., Sun, Min Woo, Lozano, Alejandro, Zou, James
Doctors and patients alike increasingly use Large Language Models (LLMs) to diagnose clinical cases. However, unlike domains such as math or coding, where correctness can be objectively defined by the final answer, medical diagnosis requires both the outcome and the reasoning process to be accurate. Currently, widely used medical benchmarks like MedQA and MMLU assess only accuracy in the final answer, overlooking the quality and faithfulness of the clinical reasoning process. To address this limitation, we introduce MedCaseReasoning, the first open-access dataset for evaluating LLMs on their ability to align with clinician-authored diagnostic reasoning. The dataset includes 14,489 diagnostic question-and-answer cases, each paired with detailed reasoning statements derived from open-access medical case reports. We evaluate state-of-the-art reasoning LLMs on MedCaseReasoning and find significant shortcomings in their diagnoses and reasoning: for instance, the top-performing open-source model, DeepSeek-R1, achieves only 48% 10-shot diagnostic accuracy and mentions only 64% of the clinician reasoning statements (recall). However, we demonstrate that fine-tuning LLMs on the reasoning traces derived from MedCaseReasoning significantly improves diagnostic accuracy and clinical reasoning recall by an average relative gain of 29% and 41%, respectively. The open-source dataset, code, and models are available at https://github.com/kevinwu23/Stanford-MedCaseReasoning.
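The reasoning-recall metric can be illustrated with a minimal sketch: count the clinician reasoning statements that are mentioned in the model's output. Simple lexical overlap stands in here for the paper's actual matching procedure, which may differ.

```python
# Minimal sketch: fraction of clinician-authored reasoning statements that
# appear (by token overlap) in the model's diagnostic reasoning output.
def reasoning_recall(model_output: str, clinician_statements: list[str],
                     threshold: float = 0.6) -> float:
    out_tokens = set(model_output.lower().split())
    covered = 0
    for stmt in clinician_statements:
        stmt_tokens = set(stmt.lower().split())
        overlap = len(stmt_tokens & out_tokens) / max(len(stmt_tokens), 1)
        covered += overlap >= threshold  # statement counts as "mentioned"
    return covered / max(len(clinician_statements), 1)
```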
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Rocky Mountains (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Recommending Clinical Trials for Online Patient Cases using Artificial Intelligence
Chan, Joey, Jin, Qiao, Wan, Nicholas, Floudas, Charalampos S., Xue, Elisabetta, Lu, Zhiyong
Clinical trials are crucial for assessing new treatments; however, recruitment challenges - such as limited awareness, complex eligibility criteria, and referral barriers - hinder their success. With the growth of online platforms, patients increasingly turn to social media and health communities for support, research, and advocacy, expanding recruitment pools beyond established enrollment pathways. Recognizing this potential, we utilized TrialGPT, a framework that leverages a large language model (LLM) as its backbone, to match 50 online patient cases (collected from published case reports and a social media website) to clinical trials and to evaluate performance against traditional keyword-based searches. Our results show that TrialGPT outperforms traditional methods by 46% in identifying eligible trials, with each patient, on average, being eligible for around 7 trials. Additionally, our outreach efforts to case authors and trial organizers regarding these patient-trial matches yielded highly positive feedback, which we present from both perspectives.
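To contrast the two matching strategies, here is a minimal sketch of a keyword baseline next to an LLM-based eligibility judgment. The keyword rule and the `llm_judge_eligibility` helper are illustrative assumptions, not TrialGPT's actual implementation.

```python
# Minimal sketch: keyword-based trial matching vs. an LLM eligibility judge.
def keyword_match(patient_case: str, trial: dict) -> bool:
    """Baseline: flag a trial if any condition keyword appears in the case."""
    text = patient_case.lower()
    return any(kw.lower() in text for kw in trial["condition_keywords"])

def llm_match(patient_case: str, trial: dict, llm_judge_eligibility) -> bool:
    """LLM-based: reason over full eligibility criteria, not surface keywords."""
    verdict = llm_judge_eligibility(
        f"Patient case:\n{patient_case}\n\n"
        f"Eligibility criteria:\n{trial['criteria']}\n\n"
        "Is this patient potentially eligible? Answer yes or no."
    )
    return verdict.strip().lower().startswith("yes")
```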
- North America > United States > Maryland > Montgomery County > Bethesda (0.04)
- Asia > Middle East > Iran > East Azerbaijan Province > Tabriz (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.98)
Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases
Qiu, Pengcheng, Wu, Chaoyi, Liu, Shuyu, Zhao, Weike, Chen, Zhuoxia, Gu, Hongfei, Peng, Chuanjin, Zhang, Ya, Wang, Yanfeng, Xie, Weidi
Recent advancements in reasoning-enhanced large language models (LLMs), such as DeepSeek-R1 and OpenAI-o3, have demonstrated significant progress. However, their application in professional medical contexts remains underexplored, particularly in evaluating the quality of their reasoning processes alongside final outputs. Here, we introduce MedR-Bench, a benchmarking dataset of 1,453 structured patient cases annotated with reasoning references derived from clinical case reports. Spanning 13 body systems and 10 specialties, it includes both common and rare diseases. To comprehensively evaluate LLM performance, we propose a framework encompassing three critical stages: examination recommendation, diagnostic decision-making, and treatment planning, simulating the entire patient care journey. To assess reasoning quality, we present the Reasoning Evaluator, a novel automated system that objectively scores free-text reasoning responses on efficiency, factuality, and completeness using dynamic cross-referencing and evidence checks. Using this benchmark, we evaluate five state-of-the-art reasoning LLMs, including DeepSeek-R1, OpenAI-o3-mini, and Gemini-2.0-Flash Thinking. Our results show that current LLMs achieve over 85% accuracy in relatively simple diagnostic tasks when provided with sufficient examination results. However, performance declines on more complex tasks, such as examination recommendation and treatment planning. While reasoning outputs are generally reliable, with factuality scores exceeding 90%, critical reasoning steps are frequently missed. These findings underscore both the progress and the limitations of clinical LLMs. Notably, open-source models like DeepSeek-R1 are narrowing the gap with proprietary systems, highlighting their potential to drive accessible and equitable advancements in healthcare.
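A minimal sketch of scoring in the spirit of the Reasoning Evaluator is below: model reasoning steps are cross-referenced against annotated reference steps to yield efficiency, factuality, and completeness scores. The `step_matches` predicate and the three ratios are illustrative assumptions, not the paper's exact definitions.

```python
# Minimal sketch: score free-text reasoning by cross-referencing model steps
# against annotated reference steps.
def score_reasoning(model_steps, reference_steps, step_matches):
    """Return (efficiency, factuality, completeness), each in [0, 1].
    step_matches(model_step, reference_step) -> bool is a hypothetical judge."""
    supported = [s for s in model_steps
                 if any(step_matches(s, r) for r in reference_steps)]
    covered = [r for r in reference_steps
               if any(step_matches(s, r) for s in model_steps)]
    factuality = len(supported) / max(len(model_steps), 1)      # steps that check out
    completeness = len(covered) / max(len(reference_steps), 1)  # reference steps hit
    efficiency = len(covered) / max(len(model_steps), 1)        # coverage per emitted step
    return efficiency, factuality, completeness
```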
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Maryland > Montgomery County > Bethesda (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (2 more...)